301 research outputs found
GNN-encoder: Learning a Dual-encoder Architecture via Graph Neural Networks for Passage Retrieval
Recently, retrieval models based on dense representations have come to dominate
passage retrieval tasks, owing to their superior ability to capture the
semantics of input text compared with traditional sparse vector-space models.
A common practice of dense retrieval models is to exploit a dual-encoder
architecture to represent a query and a passage independently. Though
efficient, such a structure loses the interaction between the query-passage
pair, resulting in inferior accuracy. To enhance the performance of dense retrieval
models without loss of efficiency, we propose a GNN-encoder model in which
query (passage) information is fused into passage (query) representations via
graph neural networks that are constructed by queries and their top retrieved
passages. In this way, we maintain a dual-encoder structure while retaining some
interaction information between query-passage pairs in their representations,
which enables us to achieve both efficiency and efficacy in passage retrieval.
Evaluation results indicate that our method significantly outperforms existing
models on the MSMARCO, Natural Questions and TriviaQA datasets, achieving new
state-of-the-art results on these datasets. Comment: 11 pages, 6 figures
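The dual-encoder setup described in the abstract embeds the query and each passage independently and compares them by inner product. A minimal sketch of that scoring scheme is below; the bag-of-words `encode` is a toy stand-in for the learned encoder towers, and `VOCAB` and the example texts are illustrative assumptions (the paper's GNN-based fusion is not modeled here).

```python
import numpy as np

# Shared toy vocabulary (assumption: real dual encoders use learned
# BERT-style towers, not a bag-of-words).
VOCAB = ["paris", "is", "the", "capital", "of", "france",
         "moon", "orbits", "earth"]

def encode(text: str) -> np.ndarray:
    """Toy stand-in for a learned encoder: a token-count vector over VOCAB."""
    vec = np.zeros(len(VOCAB))
    for tok in text.lower().split():
        if tok in VOCAB:
            vec[VOCAB.index(tok)] += 1.0
    return vec

query = "capital of france"
passages = [
    "paris is the capital of france",
    "the moon orbits the earth",
]

# Dual-encoder scoring: query and passages are embedded independently,
# then compared by inner product -- no cross-attention between the pair.
q_vec = encode(query)
scores = [float(q_vec @ encode(p)) for p in passages]
best = int(np.argmax(scores))  # index of the top-ranked passage
```

Because the two towers never see each other's input, passage vectors can be precomputed and indexed offline, which is the efficiency the abstract refers to; the lost query-passage interaction is what the GNN-encoder aims to restore.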
To what extent can control policies influence the epidemic spreading? -- A data-driven analysis based on the first wave of COVID-19
On May 5th, 2023, the WHO declared an end to the global COVID-19 public health
emergency, marking a significant transition from global critical emergency
response activities to long-term, sustained COVID-19 prevention and control.
Against this backdrop, we present a comprehensive review of the control
policies adopted by 127 countries/territories during the first wave of the
COVID-19 pandemic up to July 2nd, 2020, and quantitatively evaluate their
impacts on the epidemic dynamics through both linear and nonlinear regressions.
Our analyses reveal the intrinsic correlations between the strength of control
policies and the dynamical characteristics of COVID-19 epidemics, both for each
country/territory under consideration and from a global perspective. Our
results may help to design more economical and more effective preventive
measures in the long-term fight against COVID-19. Comment: 17 pages, 5 figures, 2 tables
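The abstract's quantitative evaluation pairs linear and nonlinear regressions of policy strength against epidemic dynamics. A minimal sketch of that kind of analysis follows, on made-up illustrative numbers; the actual per-country policy indices and fitted epidemic quantities are assumptions here, not the paper's data.

```python
import numpy as np

# Illustrative stand-ins (assumption: the paper uses per-country
# policy-strength indices and observed epidemic characteristics).
policy_strength = np.array([0.1, 0.3, 0.5, 0.7, 0.9])
growth_rate     = np.array([0.30, 0.24, 0.17, 0.12, 0.06])

# Linear regression: growth_rate ~ a * strength + b
a, b = np.polyfit(policy_strength, growth_rate, 1)

# One simple nonlinear alternative: a quadratic fit and its residuals
coeffs = np.polyfit(policy_strength, growth_rate, 2)
resid = growth_rate - np.polyval(coeffs, policy_strength)

# Correlation between policy strength and epidemic growth
r = np.corrcoef(policy_strength, growth_rate)[0, 1]
```

A strongly negative slope and correlation coefficient would indicate that stricter control policies coincide with slower epidemic growth, which is the kind of intrinsic correlation the abstract describes.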
PreQuant: A Task-agnostic Quantization Approach for Pre-trained Language Models
While transformer-based pre-trained language models (PLMs) have dominated a
number of NLP applications, these models are cumbersome to deploy and expensive
to use. Effectively compressing large-scale PLMs has therefore become an
increasingly important problem. Quantization, which represents high-precision
tensors in a low-bit fixed-point format, is a viable solution. However, most
existing quantization methods are task-specific, requiring customized training
and quantization with a large number of trainable parameters on each individual
task. Inspired by the observation that the over-parameterized nature of PLMs
makes it possible to freeze most of the parameters during the fine-tuning
stage, in this work we propose a novel ``quantize before fine-tuning''
framework, PreQuant, which differs from both quantization-aware training and
post-training quantization. PreQuant is compatible with various quantization
strategies, with outlier-aware parameter-efficient fine-tuning incorporated to
correct the induced quantization error. We demonstrate the effectiveness of
PreQuant on the GLUE benchmark using BERT, RoBERTa, and T5. We also provide an
empirical investigation into the workflow of PreQuant, which sheds light on its
efficacy. Comment: Findings of ACL202
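Quantization as described here maps high-precision tensors to a low-bit fixed-point format. The sketch below shows generic symmetric uniform 8-bit quantization of a weight tensor, as one simple instance of such a mapping; it is an assumption for illustration, not PreQuant's specific quantization strategy or its outlier-aware fine-tuning.

```python
import numpy as np

def quantize(w: np.ndarray, bits: int = 8):
    """Symmetric uniform quantization: map float weights to low-bit
    integers using a single scale factor per tensor."""
    qmax = 2 ** (bits - 1) - 1                  # 127 for 8 bits
    scale = np.abs(w).max() / qmax
    q = np.clip(np.round(w / scale), -qmax - 1, qmax).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover approximate float weights from the integer codes."""
    return q.astype(np.float32) * scale

w = np.array([-0.5, -0.1, 0.0, 0.2, 0.5], dtype=np.float32)
q, scale = quantize(w)
w_hat = dequantize(q, scale)

# Per-element quantization error, bounded by half a quantization step
err = float(np.abs(w - w_hat).max())
```

The gap between `w` and `w_hat` is the induced quantization error that, in PreQuant's framework, the subsequent parameter-efficient fine-tuning stage is meant to correct.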